For the LondonGangNet network, do the following:
LondonGangNet and plot a random network with
the same number of edges.LondonGangNet differ
from the random network?Prison.Prison. Interpret the coefficient.Convictions. Interpret the coefficient.Birthplace. What is the
interpretation of this matrix?Birthplace are more likely to form ties.Birthplace?Age are
more likely to form ties.Age?Prison or (and?)
Convictions ERGMs (i.e. #5 or/and #6).LondonGangNet, identifying nodes based on
Prison or Conviction in both plots.Birthplace or (and?)
Age ERGMs (i.e. #8 or/and #10).LondonGangNet, identifying nodes based on
Birthplace or Age in both plots.For the trustNet network, do the following:
trustNet and plot a random network with the
same number of edges.trustNet differ from
the random network?White.YearsOnUnit.White. Interpret the coefficient.YearsOnUnit. Interpret the coefficient.White. What is the
interpretation of this matrix?White (i.e. homophily for the attribute
White).trustNet,
identifying nodes based on White or (and?)
YearsOnUnit in both plots.First, let’s read in the LondonGangNet network. This is
stored as an .rds document in the data folder on the SNA
Textbook site. We will use the readRDS() function, with
the file
path, to load the file. Since we are calling a url, we need to use
the url() function as well.
Finally, we need to make sure the sna and
network packages are loaded, using library(),
so that R recognizes the LondonGangNet object as one of
class network.
# load the libraries we need
library( sna )
library( network )
library( ergm )
# define the path location for the file
loc <- "https://github.com/jacobtnyoung/sna-textbook/raw/main/data/data-london-gang-net.rds"
LondonGangNet <- readRDS( url(loc ) )
# look at the network
LondonGangNet## Network attributes:
## vertices = 54
## directed = FALSE
## hyper = FALSE
## loops = FALSE
## multiple = FALSE
## bipartite = FALSE
## total edges= 315
## missing edges= 0
## non-missing edges= 315
##
## Vertex attribute names:
## Age Arrests Birthplace Convictions Music Prison Ranking Residence vertex.names
##
## No edge attributes
LondonGangNet and plot a random network
with the same number of edges.First, let’s set up our random network. Recall that we want to
generate a random network with the same number of edges and density as
the LondonGangNet object.
# set the seed to reproduce these results
set.seed( 605 )
# generate the random graph
random.graph <- rgraph(
dim( as.matrix( LondonGangNet ) )[1],
1,
tprob = sum( as.matrix( LondonGangNet ) )/2 / ( dim( as.matrix( LondonGangNet ) )[1] *( dim( as.matrix( LondonGangNet ) )[1] - 1 ) / 2 ) ,
mode = "graph"
)
# now coerce the random graph to a network object
random.net <- as.network( random.graph, directed = FALSE )Now, we can plot them both:
# set the margins
par( mfrow=c( 1,2 ),
mar=c( 0.1, 0.5, 2, 0.5 ) )
# set the seed
set.seed( 605 )
# create the first plot
gplot(
LondonGangNet,
gmode = "graph",
edge.col="grey40",
vertex.col="#c78c71",
coord = coords,
main = "London Gang Network"
)
# create the second plot
gplot(
random.net,
gmode = "graph",
edge.col = "grey40",
vertex.col="#069e6e",
main = "Random network"
)LondonGangNet
differ from the random network?The most striking difference is in the distribution of the edges over
the nodes. Even though the number of edges are the same in each graph,
the random graph looks, well, more random, whereas the
LondonGangNet shows a core group with many edges and some
peripheral nodes with fewer edges. Put differently, the random graph
does not reproduce the variation in the degree distribution that is
observable in the plot of the LondonGangNet.
## Call:
## ergm(formula = LondonGangNet ~ edges)
##
## Maximum Likelihood Results:
##
## Estimate Std. Error MCMC % z value Pr(>|z|)
## edges -1.2649 0.0638 0 -19.83 <1e-04 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Null Deviance: 1984 on 1431 degrees of freedom
## Residual Deviance: 1508 on 1430 degrees of freedom
##
## AIC: 1510 BIC: 1516 (Smaller is better. MC Std. Err. = 0)
## edges
## 0.2201258
The probability of a tie in this model is 0.22. This is the same as the density of the network, 0.22
Prison.The mean degree for those who have been to prison is
12.71, while the mean degree for those who have not
been to prison is 10.83. This means that individuals who have been to
prison (i.e. LondonGangNet %v% "Prison" == 1) have, on
average, roughly two more edges than those who have not been to
prison.
Prison. Interpret the coefficient.# estimate the model
prison.LGN <- ergm(
LondonGangNet ~ edges
+ nodefactor( "Prison" )
)
# summarize the model
summary( prison.LGN )## Call:
## ergm(formula = LondonGangNet ~ edges + nodefactor("Prison"))
##
## Maximum Likelihood Results:
##
## Estimate Std. Error MCMC % z value Pr(>|z|)
## edges -1.45708 0.10762 0 -13.539 <1e-04 ***
## nodefactor.Prison.1 0.20948 0.09158 0 2.287 0.0222 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Null Deviance: 1984 on 1431 degrees of freedom
## Residual Deviance: 1503 on 1429 degrees of freedom
##
## AIC: 1507 BIC: 1518 (Smaller is better. MC Std. Err. = 0)
The sign of the coefficient is positive, indicating that the
probability of a tie increases if we toggle from Prison
being 0 to Prison being a 1.
More specifically, we can calculate the predicted probability of a tie between i and j if they both have been to prison using the coefficient of 0.21:
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(\theta_{edges} \times \delta_{edges} + \theta_{prison} \times \delta_{prison})\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic((-1.46 \times 1) + (0.21 \times 2))\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(-1.46 + 0.42)\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(-1.04) = 0.26\)
The value of 0.26 is the predicted probability of a tie.
Convictions. Interpret the
coefficient.# estimate the model
convictions.LGN <- ergm(
LondonGangNet ~ edges
+ nodecov( "Convictions" )
)
# summarize the model
summary( convictions.LGN )## Call:
## ergm(formula = LondonGangNet ~ edges + nodecov("Convictions"))
##
## Maximum Likelihood Results:
##
## Estimate Std. Error MCMC % z value Pr(>|z|)
## edges -1.38274 0.12470 0 -11.088 <1e-04 ***
## nodecov.Convictions 0.01385 0.01245 0 1.113 0.266
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Null Deviance: 1984 on 1431 degrees of freedom
## Residual Deviance: 1507 on 1429 degrees of freedom
##
## AIC: 1511 BIC: 1522 (Smaller is better. MC Std. Err. = 0)
The sign of the coefficient is positive, but not significantly
different from zero. This indicates that the probability of a tie, based
on changing Convictions from 0 to 1 does not differ from
what we may observe if ties formed at random.
Birthplace. What is
the interpretation of this matrix?## 1 2 3 4
## 1 35 38 48 26
## 2 38 23 48 16
## 3 48 48 46 27
## 4 26 16 27 8
The diagonal represents homophilous dyads. The off-diagonal
represents heterophilous dyads. Based on a visual inspection of the
table, there does not appear to be a tendency toward homophily based on
Birthplace.
Birthplace are more likely to form ties.# estimate the model using the nodematch term
homophily.birthplace.LGN <- ergm(
LondonGangNet ~ edges
+ nodematch( "Birthplace" )
)
# print the model output
summary( homophily.birthplace.LGN )## Call:
## ergm(formula = LondonGangNet ~ edges + nodematch("Birthplace"))
##
## Maximum Likelihood Results:
##
## Estimate Std. Error MCMC % z value Pr(>|z|)
## edges -1.37764 0.07854 0 -17.541 < 1e-04 ***
## nodematch.Birthplace 0.35634 0.13532 0 2.633 0.00846 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Null Deviance: 1984 on 1431 degrees of freedom
## Residual Deviance: 1502 on 1429 degrees of freedom
##
## AIC: 1506 BIC: 1516 (Smaller is better. MC Std. Err. = 0)
Birthplace?Using the coefficient of
round( homophily.birthplace.LGN$coefficients[2], 2 ), the
predicted probability of an edge between nodes with the
same Birthplace is:
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(\theta_{edges} \times \delta_{edges} + (\theta_{Birthplace} \times \delta_{Birthplace}))\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(-1.38 \times 1 + 1.5 \times 1)\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(-1.38 + 1.5)\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(0.12) = 0.53\)
Note that the model above tests for uniform homophily. That
is, that homophily is the same over the categories of
Birthplace. We can test for differential homophily
by setting the diff= argument to TRUE in the
nodematch() term.
# estimate the model using the nodematch term
homophily.d.LGN <- ergm(
LondonGangNet ~ edges
+ nodematch( "Birthplace", diff = TRUE )
)
# print the model output
summary( homophily.d.LGN )## Call:
## ergm(formula = LondonGangNet ~ edges + nodematch("Birthplace",
## diff = TRUE))
##
## Maximum Likelihood Results:
##
## Estimate Std. Error MCMC % z value Pr(>|z|)
## edges -1.37764 0.07854 0 -17.541 < 1e-04 ***
## nodematch.Birthplace.1 1.49900 0.25884 0 5.791 < 1e-04 ***
## nodematch.Birthplace.2 0.75193 0.27000 0 2.785 0.00535 **
## nodematch.Birthplace.3 -0.23180 0.17960 0 -1.291 0.19682
## nodematch.Birthplace.4 1.51117 0.52347 0 2.887 0.00389 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Null Deviance: 1984 on 1431 degrees of freedom
## Residual Deviance: 1459 on 1426 degrees of freedom
##
## AIC: 1469 BIC: 1495 (Smaller is better. MC Std. Err. = 0)
Looking at the model output we see that there is considerable differences across the categories in terms of homophily.
Age
are more likely to form ties.# estimate the model using the absdiff term
homophily.age.LGN <- ergm(
LondonGangNet ~ edges
+ absdiff( "Age" )
)
# print the model output
summary( homophily.age.LGN )## Call:
## ergm(formula = LondonGangNet ~ edges + absdiff("Age"))
##
## Maximum Likelihood Results:
##
## Estimate Std. Error MCMC % z value Pr(>|z|)
## edges -0.97013 0.09910 0 -9.789 < 1e-04 ***
## absdiff.Age -0.10632 0.02901 0 -3.665 0.000247 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Null Deviance: 1984 on 1431 degrees of freedom
## Residual Deviance: 1494 on 1429 degrees of freedom
##
## AIC: 1498 BIC: 1509 (Smaller is better. MC Std. Err. = 0)
Age?Using the coefficient of -0.11, if two individuals are the
same Age, the predicted probability of a
tie between i and j is:
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(\theta_{edges} \times \delta_{edges} + \theta_{age} \times \delta_{age})\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic((-0.97 \times 1) + (-0.11 \times 0))\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(-0.97 + 0)\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(-0.97) = 0.27\)
As the negative coefficient indicates, the predicated probability of an edge between two nodes decreases as the difference in their age increases.
Prison or (and?)
Convictions ERGMs (i.e. #5 or/and #6).# simulate from the prison ergm
sim.prison.LGN <- simulate(
prison.LGN,
nsim=1,
seed = 605
)
# simulate from the conviction ergm
sim.convictions.LGN <- simulate(
convictions.LGN,
nsim=1,
seed = 605
)LondonGangNet, identifying nodes based on
Prison or Conviction in both plots.# use the rescale function for the plot
rescale <- function( nchar, low, high ){
min_d <- min( nchar )
max_d <- max( nchar )
rscl <- ( ( high - low )*( nchar - min_d ) ) / ( max_d - min_d ) + low
rscl
}
# set the margins
par( mfrow=c( 2,2 ),
mar=c( 0.1, 0.5, 2, 0.5 ) )
# set the seed
set.seed( 605 )
# create the first plot
gplot(
LondonGangNet,
gmode = "graph",
edge.col="grey40",
vertex.col="#f71505",
vertex.sides = LondonGangNet %v% "Prison" + 3,
vertex.cex = rescale( LondonGangNet %v% "Convictions", 0.5, 2 ),
coord = coords,
main = "London Gang Network"
)
# create the second plot for the simulated network
gplot(
sim.prison.LGN,
gmode = "graph",
edge.col = "grey40",
vertex.col="#11adf5",
vertex.sides = sim.prison.LGN %v% "Prison" + 3,
main = "Network Simulated from\n Prison ERGM"
)
# create the third plot for the simulated network
gplot(
sim.convictions.LGN,
gmode = "graph",
edge.col = "grey40",
vertex.col="#f2adf3",
vertex.cex = rescale( sim.convictions.LGN %v% "Convictions", 0.5, 2 ),
main = "Network Simulated from\n Convictions ERGM"
)Birthplace or (and?)
Age ERGMs (i.e. #8 or/and #10).# simulate from the Birthplace ergm
sim.homophily.birthplace.LGN <- simulate(
homophily.birthplace.LGN,
nsim=1,
seed = 605
)
# simulate from the Age ergm
sim.homophily.age.LGN <- simulate(
homophily.age.LGN,
nsim=1,
seed = 605
)LondonGangNet, identifying nodes based on
Birthplace or Age in both plots.# set the margins
par( mfrow=c( 2,2 ),
mar=c( 0.1, 0.5, 2, 0.5 ) )
# set the seed
set.seed( 605 )
# create the first plot
gplot(
LondonGangNet,
gmode = "graph",
edge.col="grey40",
vertex.col = LondonGangNet %v% "Birthplace",
vertex.cex = rescale( LondonGangNet %v% "Age", 0.5, 2 ),
coord = coords,
main = "London Gang Network"
)
# create the second plot for the simulated network
gplot(
sim.homophily.birthplace.LGN,
gmode = "graph",
edge.col = "grey40",
vertex.col = sim.homophily.birthplace.LGN %v% "Birthplace",
main = "Network Simulated from\n Birthplace ERGM"
)
# create the third plot for the simulated network
gplot(
sim.homophily.age.LGN,
gmode = "graph",
edge.col = "grey40",
vertex.cex = rescale( sim.homophily.age.LGN %v% "Age", 0.5, 2 ),
main = "Network Simulated from\n Age ERGM"
)# simulate the networks
prison.LGN.gof <- gof(
prison.LGN, GOF = ~degree + espartners + distance,
verbose = TRUE,
control = control.gof.ergm( seed = 605 )
)
# set the plot pane
par( mfrow = c( 2,2 ) )
# plot the results
plot( prison.LGN.gof )This model does a poor job reproducing the degree distribution and the edgewise shared partner distribution.
# simulate the networks
convictions.LGN.gof <- gof(
convictions.LGN, GOF = ~degree + espartners + distance,
verbose = TRUE,
control = control.gof.ergm( seed = 605 )
)
# set the plot pane
par( mfrow = c( 2,2 ) )
# plot the results
plot( convictions.LGN.gof )This model also does a poor job reproducing the degree distribution and the edgewise shared partner distribution.
# simulate the networks
homophily.birthplace.LGN.gof <- gof(
homophily.birthplace.LGN, GOF = ~degree + espartners + distance,
verbose = TRUE,
control = control.gof.ergm( seed = 605 )
)
# set the plot pane
par( mfrow = c( 2,2 ) )
# plot the results
plot( homophily.birthplace.LGN.gof )This model also does a poor job reproducing the degree distribution and the edgewise shared partner distribution. In particular, the edgewise shared partner distribution is very poorly recreated.
# simulate the networks
homophily.age.LGN.gof <- gof(
homophily.age.LGN, GOF = ~degree + espartners + distance,
verbose = TRUE,
control = control.gof.ergm( seed = 605 )
)
# set the plot pane
par( mfrow = c( 2,2 ) )
# plot the results
plot( homophily.age.LGN.gof )Similar to what we saw above. The main reason we are seeing this poor fits is that we do not have a term to capture the edgewise shared partners. Can you think of a term the model could include that would accomplish this?
None do a good job representing the data.
First, let’s read in the trustNet network. This is
stored as an .rds document in the data folder on the SNA
Textbook site. We will use the readRDS() function, with
the file
path, to load the file. Since we are calling a url, we need to use
the url() function as well.
Finally, we need to make sure the sna and
network packages are loaded, using library(),
so that R recognizes the trustNet object as one of class
network.
# clear the workspace since we may recycle objects below
rm( list = ls() )
# load the libraries we need
library( sna )
library( network )
library( ergm )
# define the path location for the file
loc <- "https://github.com/jacobtnyoung/sna-textbook/raw/main/data/data-WOPINS-s1-trust-net.rds"
trustNet <- readRDS( url( loc ) )
# print it out to look at it
trustNet## Network attributes:
## vertices = 131
## directed = TRUE
## hyper = FALSE
## loops = FALSE
## multiple = FALSE
## bipartite = FALSE
## total edges= 515
## missing edges= 0
## non-missing edges= 515
##
## Vertex attribute names:
## vertex.names White YearsOnUnit
##
## No edge attributes
trustNet and plot a random network with the
same number of edges.# set the seed to reproduce these results
set.seed( 605 )
# generate the random graph
random.graph <- rgraph(
dim( as.matrix( trustNet ) )[1],
1,
tprob = sum( as.matrix( trustNet ) ) / ( dim( as.matrix( trustNet ) )[1] *( dim( as.matrix( trustNet ) )[1] - 1 ) ) ,
mode = "digraph"
)
# now coerce the random graph to a network object
random.net <- as.network( random.graph, directed = TRUE )Now, we can plot them both:
# set the margins
par( mfrow=c( 1,2 ),
mar=c( 0.1, 0.5, 2, 0.5 ) )
# create the first plot
gplot(
trustNet,
gmode = "digraph",
edge.col="grey40",
vertex.col="#faa700",
coord = coords,
main = "Trust Network"
)
# create the second plot
gplot(
random.net,
gmode = "digraph",
edge.col = "grey40",
vertex.col="#069e6e",
main = "Random network"
)
trustNet differ
from the random network?Similar to what we saw above, the transitivity in the
trustNet network is not reproduced in the random graph.
## Call:
## ergm(formula = trustNet ~ edges)
##
## Maximum Likelihood Results:
##
## Estimate Std. Error MCMC % z value Pr(>|z|)
## edges -3.46786 0.04474 0 -77.51 <1e-04 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Null Deviance: 23609 on 17030 degrees of freedom
## Residual Deviance: 4618 on 17029 degrees of freedom
##
## AIC: 4620 BIC: 4628 (Smaller is better. MC Std. Err. = 0)
## edges
## 0.03024075
The probability of a tie in this model is 0.03. This is the same as the density of the network, 0.03
White.The mean indegree for those who are White is 3.44,
while the mean indegree for those who are not White is
4.59. This means that individuals who are White
(i.e. trustNet %v% "White" == 1) receive, on average,
roughly 1 less edge compared to those who are not White.
The mean outdegree for those who are White is 4.17,
while the mean outdegree for those who are not White is
3.61. This means that individuals who are White
(i.e. trustNet %v% "White" == 1) send, on average, more
edges (though it is small) compared to those who are not White.
YearsOnUnit.The mean indegree for those who are at or below the median for
YearsOnUnit is 3.24, while the mean indegree for those who
are above the median for YearsOnUnit is 4.63. This means
that individuals who have been on the unit longer receive more trust
nominations.
The mean outdegree for those who are at or below the median for
YearsOnUnit is 3.06, while the mean outdegree for those who
are above the median for YearsOnUnit is 4.82. This means
that individuals who have been on the unit longer also send more trust
nominations.
White. Interpret the coefficient.We could approach this a few different ways. One is to estimate a
nodefactor term where we do not differentiate between
incoming and outgoing ties.
# estimate the model
white.TN <- ergm(
trustNet ~ edges
+ nodefactor( "White" )
)
# summarize the model
summary( white.TN )## Call:
## ergm(formula = trustNet ~ edges + nodefactor("White"))
##
## Maximum Likelihood Results:
##
## Estimate Std. Error MCMC % z value Pr(>|z|)
## edges -3.38137 0.08387 0 -40.32 <1e-04 ***
## nodefactor.White.1 -0.07671 0.06391 0 -1.20 0.23
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Null Deviance: 23609 on 17030 degrees of freedom
## Residual Deviance: 4616 on 17028 degrees of freedom
##
## AIC: 4620 BIC: 4636 (Smaller is better. MC Std. Err. = 0)
A second approach would be to estimate separate effects for degree
differences by White using the nodeifactor and
nodeofactor terms. Let’s take a look at that model.
# estimate the model
white2.TN <- ergm(
trustNet ~ edges
+ nodeifactor( "White" )
+ nodeofactor( "White" )
)
# summarize the model
summary( white2.TN )## Call:
## ergm(formula = trustNet ~ edges + nodeifactor("White") + nodeofactor("White"))
##
## Maximum Likelihood Results:
##
## Estimate Std. Error MCMC % z value Pr(>|z|)
## edges -3.39577 0.08461 0 -40.132 < 1e-04 ***
## nodeifactor.White.1 -0.29630 0.08953 0 -3.310 0.000934 ***
## nodeofactor.White.1 0.14807 0.09163 0 1.616 0.106115
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Null Deviance: 23609 on 17030 degrees of freedom
## Residual Deviance: 4604 on 17027 degrees of freedom
##
## AIC: 4610 BIC: 4633 (Smaller is better. MC Std. Err. = 0)
First, the nodeifactor term is negative and
significantly different from zero, indicating that the probability of
receiving a tie decreases if an individual is White
compared to an individual who is not White.
More specifically, we can calculate the predicted probability of i sending a tie to j if j is White using the coefficient of -0.3:
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(\theta_{edges} \times \delta_{edges} + \theta_{White} \times \delta_{White})\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic((-3.39 \times 1) + (-0.3 \times 1))\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(-3.39 + -0.3)\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(-3.69) = 0.24\)
The value of 0.02 is the predicted probability of a tie.
The nodeofactor term is not significantly different from
zero, indicating that there is not a substantive difference in outdegree
based on the attribute White.
YearsOnUnit. Interpret the
coefficient.As with White, we could model the effects of
YearsOnUnit by either not considering the directionality of
the ties or we could incorporate that information. Note that since
YearsOnUnit is continuous, we use the nodecov
terms.
# estimate the model
you.TN <- ergm(
trustNet ~ edges
+ nodecov( "YearsOnUnit" )
)
# summarize the model
summary( you.TN )## Call:
## ergm(formula = trustNet ~ edges + nodecov("YearsOnUnit"))
##
## Maximum Likelihood Results:
##
## Estimate Std. Error MCMC % z value Pr(>|z|)
## edges -3.898671 0.079856 0 -48.821 <1e-04 ***
## nodecov.YearsOnUnit 0.052285 0.007261 0 7.201 <1e-04 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Null Deviance: 23609 on 17030 degrees of freedom
## Residual Deviance: 4569 on 17028 degrees of freedom
##
## AIC: 4573 BIC: 4589 (Smaller is better. MC Std. Err. = 0)
Now let’s separate them:
# estimate the model
you2.TN <- ergm(
trustNet ~ edges
+ nodeicov( "YearsOnUnit" )
+ nodeocov( "YearsOnUnit" )
)
# summarize the model
summary( you2.TN )## Call:
## ergm(formula = trustNet ~ edges + nodeicov("YearsOnUnit") + nodeocov("YearsOnUnit"))
##
## Maximum Likelihood Results:
##
## Estimate Std. Error MCMC % z value Pr(>|z|)
## edges -3.89869 0.07987 0 -48.815 <1e-04 ***
## nodeicov.YearsOnUnit 0.05809 0.01009 0 5.757 <1e-04 ***
## nodeocov.YearsOnUnit 0.04635 0.01032 0 4.493 <1e-04 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Null Deviance: 23609 on 17030 degrees of freedom
## Residual Deviance: 4568 on 17027 degrees of freedom
##
## AIC: 4574 BIC: 4598 (Smaller is better. MC Std. Err. = 0)
First, the nodeicov term is positive and significantly
different from zero, indicating that the probability of receiving a tie
increases as the number of years the individual has spent on the unit
increases.
More specifically, we can calculate the predicted probability of i sending a tie to j if j has spent the mean number of years on the unit, which is 3.72, using the coefficient of 0.06:
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(\theta_{edges} \times \delta_{edges} + \theta_{YearsOnUnit} \times \delta_{YearsOnUnit})\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic((-3.89 \times 1) + (0.06 \times 3.72))\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(-3.89 + 0.22)\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(-3.67) = 0.25\)
The value of 0.02 is the predicted probability of a tie.
Second, the nodeocov term is also positive and
significantly different from zero, indicating that the probability of
sending a tie increases as the number of years the individual has spent
on the unit increases.
More specifically, we can calculate the predicted probability of i sending a tie to j if i has spent the mean number of years on the unit (note the difference), which is 3.72, using the coefficient of 0.05:
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(\theta_{edges} \times \delta_{edges} + \theta_{YearsOnUnit} \times \delta_{YearsOnUnit})\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic((-3.89 \times 1) + (0.05 \times 3.72))\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(-3.89 + 0.19)\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(-3.7) = 0.24\)
The value of 0.02 is the predicted probability of a tie.
White. What is the
interpretation of this matrix?## To
## From 0 1 Sum
## 0 131 71 202
## 1 126 187 313
## Sum 257 258 515
Based on the mixing matrix, we can see that there are more homophilous trust ties (the diagonal) compared to heterophilous ties. Interestingly, this is mainly due to non-White individuals sending fewer ties to White individuals.
White (i.e. homophily for the attribute
White).# estimate the model
homophily.TN <- ergm(
trustNet ~ edges
+ nodematch( "White" )
)
# summarize the model
summary( homophily.TN )## Call:
## ergm(formula = trustNet ~ edges + nodematch("White"))
##
## Maximum Likelihood Results:
##
## Estimate Std. Error MCMC % z value Pr(>|z|)
## edges -3.72905 0.07210 0 -51.722 <1e-04 ***
## nodematch.White 0.46565 0.09199 0 5.062 <1e-04 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Null Deviance: 23609 on 17030 degrees of freedom
## Residual Deviance: 4591 on 17028 degrees of freedom
##
## AIC: 4595 BIC: 4611 (Smaller is better. MC Std. Err. = 0)
Using the coefficient of
round( homophily.TN$coefficients[2], 2 ), the predicted
probability of an edge between nodes who are both White
or nodes who are both non-White is:
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(\theta_{edges} \times \delta_{edges} + (\theta_{homophily} \times \delta_{homophily}))\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(-3.73 \times 1 + 0.47 \times 1)\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(-3.73 + 0.47)\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(-3.26) = 0.037\)
The value of 0.037 is the predicted probability of a tie.
In the model above, it is the same. But, if we incorporate differential homophily, then we have a different coefficient.
# estimate the model
homophily2.TN <- ergm(
trustNet ~ edges
+ nodematch( "White", diff=TRUE )
)
# summarize the model
summary( homophily2.TN )## Call:
## ergm(formula = trustNet ~ edges + nodematch("White", diff = TRUE))
##
## Maximum Likelihood Results:
##
## Estimate Std. Error MCMC % z value Pr(>|z|)
## edges -3.7290 0.0721 0 -51.722 < 1e-04 ***
## nodematch.White.0 0.6150 0.1148 0 5.359 < 1e-04 ***
## nodematch.White.1 0.3729 0.1036 0 3.599 0.000319 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Null Deviance: 23609 on 17030 degrees of freedom
## Residual Deviance: 4587 on 17027 degrees of freedom
##
## AIC: 4593 BIC: 4616 (Smaller is better. MC Std. Err. = 0)
Using the coefficient of
round( homophily2.TN$coefficients[2], 2 ), the predicted
probability of an edge between nodes who are both
non-White is:
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(\theta_{edges} \times \delta_{edges} + (\theta_{diffhomophily} \times \delta_{diffhomophily}))\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(-3.73 \times 1 + 0.62 \times 1)\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(-3.73 + 0.62)\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(-3.11) = 0.043\)
The value of 0.043 is the predicted probability of a tie.
# estimate the model by adding the mutual term
recip.TN <- ergm(
trustNet ~ edges
+ mutual,
control = control.ergm(
seed = 605 ) # here we use the control argument to set the seed to reproduce results
)
summary( recip.TN )## Call:
## ergm(formula = trustNet ~ edges + mutual, control = control.ergm(seed = 605))
##
## Monte Carlo Maximum Likelihood Results:
##
## Estimate Std. Error MCMC % z value Pr(>|z|)
## edges -3.74246 0.05186 0 -72.17 <1e-04 ***
## mutual 2.68741 0.16686 0 16.11 <1e-04 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Null Deviance: 23609 on 17030 degrees of freedom
## Residual Deviance: 4422 on 17028 degrees of freedom
##
## AIC: 4426 BIC: 4441 (Smaller is better. MC Std. Err. = 1.495)
Using the coefficient of 2.69, the predicted probability of a trust tie between i and j if a trust tie exist between j and i is:
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(\theta_{edges} \times \delta_{edges} + (\theta_{reciprocity} \times \delta_{reciprocity}))\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(-3.74 \times 1 + 2.69 \times 1)\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(-3.74 + 2.69)\)
\(\Bigg(P \Bigg(Y_{ij}=1 \> | \> n \> actors, Y_{ij}^C\Bigg) \Bigg) = logistic(-1.05) = 0.26\)
# simulate from the white ergm
sim.white2.TN <- simulate(
white2.TN,
nsim=1,
seed = 605
)
# simulate from the years on unit ergm
sim.you2.TN <- simulate(
you2.TN,
nsim=1,
seed = 605
)
# simulate from the homophily ergm
sim.homophily2.TN <- simulate(
homophily2.TN,
nsim=1,
seed = 605
)
# simulate from the reciprocity ergm
sim.recip.TN <- simulate(
recip.TN,
nsim=1,
seed = 605
)trustNet, identifying nodes based on White or
(and?) YearsOnUnit in both plots.First, we want to create a matrix of colored edges for our reciprocity ergm.
# create the symmetric matrix and colors
sympiMat <- symmetrize( trustNet, rule = "strong" )
sympiMatCols <- sympiMat
sympiMatCols[sympiMat == 0] <- "grey80"
sympiMatCols[sympiMat == 1] <- "#f71505" # set a color for mutual ties
# create the symmetric matrix for the simulation
sim.sympiMat <- symmetrize( sim.recip.TN, rule = "strong" )
sim.sympiMatCols <- sim.sympiMat
sim.sympiMatCols[sim.sympiMat == 0] <- "grey80"
sim.sympiMatCols[sim.sympiMat == 1] <- "#f78b07" # set a color for mutual tiesNow, set up our plots:
# use the rescale function for the plot
rescale <- function( nchar, low, high ){
min_d <- min( nchar )
max_d <- max( nchar )
rscl <- ( ( high - low )*( nchar - min_d ) ) / ( max_d - min_d ) + low
rscl
}
# set the margins
par( mfrow=c( 3,2 ),
mar=c( 0.1, 0.5, 2, 0.5 ) )
# set the seed
set.seed( 605 )
# create the first plot
gplot(
trustNet,
gmode = "digraph",
edge.col=sympiMatCols,
vertex.col="#f71505",
vertex.sides = trustNet %v% "White" + 3,
vertex.cex = rescale( trustNet %v% "YearsOnUnit", 0.5, 2 ),
coord = coords,
main = "Trust Network"
)
# create the second plot
gplot(
sim.white2.TN,
gmode = "digraph",
edge.col="grey40",
vertex.col="#0a9cf7",
vertex.sides = sim.white2.TN %v% "White" + 3,
#vertex.cex = rescale( sim.white2.TN %v% "YearsOnUnit", 0.5, 2 ),
coord = coords,
main = "Simulation from White ERGM"
)
# create the third plot
gplot(
sim.you2.TN,
gmode = "digraph",
edge.col="grey40",
vertex.col="#c3de12",
#vertex.sides = sim.you2.TN %v% "White" + 3,
vertex.cex = rescale( sim.you2.TN %v% "YearsOnUnit", 0.5, 2 ),
coord = coords,
main = "Simulation from Years\n on Unit ERGM"
)
# create the fourth plot
gplot(
sim.homophily2.TN,
gmode = "digraph",
edge.col="grey40",
vertex.col="#f707c3",
vertex.sides = sim.homophily2.TN %v% "White" + 3,
#vertex.cex = rescale( sim.homophily2.TN %v% "YearsOnUnit", 0.5, 2 ),
coord = coords,
main = "Simulation from Homophily ERGM"
)
# create the fifth plot
gplot(
sim.recip.TN,
gmode = "digraph",
edge.col=sim.sympiMatCols,
vertex.col="#f78b07",
#vertex.sides = sim.recip.TN %v% "White" + 3,
#vertex.cex = rescale( sim.recip.TN %v% "YearsOnUnit", 0.5, 2 ),
coord = coords,
main = "Simulation from Reciprocity ERGM"
)# simulate the networks
white2.TN.gof <- gof(
white2.TN, GOF = ~idegree + odegree + espartners + distance,
verbose = TRUE,
control = control.gof.ergm( seed = 605 )
)
# set the plot pane
par( mfrow = c( 3,2 ) )
# plot the results
plot( white2.TN.gof )# simulate the networks
you2.TN.gof <- gof(
you2.TN, GOF = ~idegree + odegree + espartners + distance,
verbose = TRUE,
control = control.gof.ergm( seed = 605 )
)
# set the plot pane
par( mfrow = c( 3,2 ) )
# plot the results
plot( you2.TN.gof )Look at the plots!